data science toolkit
Here's How Publishers Are Opening Their Data Science Toolkits to Advertisers
As publishers grapple with how to best make use of the troves of audience data at their disposal, a growing number are handing brands the keys to in-house data and artificial intelligence tools that could change the way ads and sponsored content are sold. The New York Times, Group Nine Media and the Washington Post are among the media companies that have taken advantage of data science projects built for editorial purposes to give advertisers a clearer picture of who's consuming their content and how to best speak to them. Publishers hope programs like these might help them gain back ground from tech giants like Facebook and Google that dominate the ads industry through targeting precision. The Times debuted a unit earlier this year called nytDEMO that encompasses two new data-crunching tools. One, called "Project Feels," is meant to gauge and analyze readers' emotional reaction to articles and videos through a crowdsourced survey tool.
3 ideas to add to your data science toolkit
I'm always on the lookout for ideas that can improve how I tackle data analysis projects. I particularly favor approaches that translate to tools I can use repeatedly. Most of the time, I find these tools on my own--by trial and error--or by consulting other practitioners. I also have an affinity for academics and academic research, and I often tweet about research papers that I come across and am intrigued by. Often, academic research results don't immediately translate to what I do, but I recently came across ideas from several research projects that are worth sharing with a wider audience.
The Data Science Toolkit - My Boot Camp Curriculum
This compilation has everything you need to jumpstart your skills in the core tasks of data transformation, modeling, and visualization. MODELING: Below is a list of popular analyses from Rexer's 2013 survey. The table is biased towards customer transaction, text, and social media data. CRAN has pages dedicated to each typical task of statistical computing (http://cran.r-project.org/web/views/). Python has several packages tailored for statistical analysis, including Pandas, Orange, PyBrain and Scikit-learn. TRANSFORMATION: OpenRefine is designed to help journalists and other non-technical people organize incomplete data from different sources.
Machine Learning: Why Now? Your questions answered here and at #StrataHadoop
Machine learning is not new. SAS has been doing it for over 20 years, and some early machine learning papers date back to the 1950s. So why is it one of the hottest topics at the Strata Hadoop World conference later this week? Clearly, Hadoop is playing a major role in the increased focus on machine learning. Patrick Hall is a Senior Machine Learning Scientist at SAS.
Machine Learning: Why Now? Your questions answered here and at #StrataHadoop
Machine learning is not new. SAS has been doing it for over 20 years, and some early machine learning papers date back to the 1950s. So why is it one of the hottest topics at the Strata Hadoop World conference later this week? Clearly, Hadoop is playing a major role in the increased focus on machine learning. Powerful, low-cost distributed computing environments coupled with Hadoop give data scientists the ability to run iterative models (like neural networks) that they may not have been able to run in the past.